Strep suis annotations
======================
Go to http://bacteria.ensembl.org/biomart/martview, select the appropriate organism database, and select the attributes listed below. Then click Results and export the file 'mart_export.txt' in TSV format.

Ensembl Gene ID
Ensembl Transcript ID
Chromosome/plasmid
Gene Start (bp)
Gene End (bp)
Strand
Transcript Start (bp)
Transcript End (bp)

genes.txt is created by running the following inline Perl script (tail -n +2 skips the header row; the older "tail +2" form is rejected by current versions of tail):
tail -n +2 mart_export.txt | perl -aF/\\t/ -ne 'chomp($F[7]);print join("\t",@F[0,1],"chr".$F[2],$F[5] =~ /^-/ ? "-" : "+",@F[3,4,6,7],1,$F[6].",",$F[7].","),"\n"' > genes.txt
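
Each line written to genes.txt by the one-liner above has 11 tab-separated fields: gene ID, transcript ID, chromosome, strand, gene start, gene end, transcript start, transcript end, an exon count of 1, and the comma-terminated start and end. A minimal sketch of a sanity check follows, assuming a POSIX shell with awk. The name check_genes is a hypothetical helper, not part of the original workflow.

```shell
# check_genes FILE: hypothetical helper, not part of the original workflow.
# Reports every line that does not have 11 tab-separated fields or whose
# strand (column 4) is not "+" or "-", and returns non-zero if any is found.
check_genes() {
    awk -F'\t' '
        NF != 11 || ($4 != "+" && $4 != "-") {
            printf "line %d looks malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Usage: check_genes genes.txt && echo "genes.txt looks consistent"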

cytoband.txt is created with "gneg" as the stain for all bands and with the size of each chromosome equal to the strain's genome length. The chromosome names must match the ones in the array file. The chromosome names in genes.txt should also be modified to match the chromosome names in cytoband.txt.
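
The cytoband step can be sketched as follows. This assumes the common five-column, tab-separated cytoband layout (chromosome, start, end, band name, stain), which the text above does not spell out. The helper names make_cytoband and chrom_mismatch, the band name "band1", and the values in the example call are all hypothetical placeholders.

```shell
# make_cytoband NAME LENGTH: hypothetical helper that prints a single band
# covering the whole chromosome, stained "gneg".
# Assumed column layout: chrom, start, end, band name, stain.
make_cytoband() {
    printf '%s\t0\t%s\t%s\tgneg\n' "$1" "$2" "band1"
}

# chrom_mismatch GENES CYTOBAND: hypothetical helper that lists chromosome
# names used in genes.txt (column 3) that are missing from cytoband.txt
# (column 1). No output means the names match.
chrom_mismatch() {
    awk -F'\t' '
        NR == FNR { seen[$1] = 1; next }   # first file: cytoband names
        !($3 in seen) { print $3 }         # second file: genes.txt names
    ' "$2" "$1" | sort -u
}

# Example call; "chr1" and 2000000 are placeholders, not real values:
# make_cytoband chr1 2000000 > cytoband.txt
```

Running chrom_mismatch genes.txt cytoband.txt after renaming the chromosomes is a quick way to confirm that the two files agree.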

annotations.txt is obtained by selecting the attributes below, downloading the result as the TSV file mart_export.txt, and running the Perl script below on it:
perl the_script_below.pl mart_export.txt > annotations.txt

Ensembl Gene ID
Associated Gene ID
Description
GO Term Name (bp)
GO Term Name (cc)
GO Term Name (mf)
EntrezGene ID

Perl script
===========
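use strict;
use warnings;

# Collect annotations per gene. BioMart emits one row per gene/GO-term
# combination, so the same gene ID can appear on many input lines.
my %annos;
while (<>)
{
	/Ensembl Gene ID/ and next;	# skip the header row
	s/[\r\n]+//g;			# strip Unix or DOS line endings
	# Keep trailing empty fields (-1) and default missing ones to ''.
	my ($id, $name, $desc, $bp, $cc, $mf, $locus) = map { $_ // '' } (split /\t/, $_, -1)[0..6];
	# Terms are joined with ", " on output, so replace every comma inside a term.
	s/,/;/g for ($bp, $cc, $mf);
	$annos{$id}{id} = [$desc, $name, $locus];
	# Record each distinct, non-empty GO term once per gene.
	$annos{$id}{bp}{$bp}++ if length $bp;
	$annos{$id}{cc}{$cc}++ if length $cc;
	$annos{$id}{mf}{$mf}++ if length $mf;
}
print "Symbol\tName\tDescription\tBiological process\tCellular component\tMolecular function\tLocusLink ID\tOther Aliases\n";
for my $id (sort keys %annos)
{
	my ($desc, $name, $locus) = @{$annos{$id}{id}};
	print join("\t",$id,$name,$desc,(map {join(", ",sort keys %{$annos{$id}{$_} || {}})} qw(bp cc mf)),$locus,$name),"\n";
}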
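
Because the script keys its output on the Ensembl Gene ID, annotations.txt should contain a header row followed by one eight-column row per gene. A minimal sketch of a check for this follows, assuming a POSIX shell with awk. The name check_annotations is a hypothetical helper, not part of the original workflow.

```shell
# check_annotations FILE: hypothetical helper, not part of the original workflow.
# Reports lines that do not have 8 tab-separated fields and gene IDs
# (column 1) that occur on more than one data line; returns non-zero if any.
check_annotations() {
    awk -F'\t' '
        NF != 8 { printf "line %d has %d fields, expected 8\n", NR, NF; bad = 1 }
        NR > 1 && seen[$1]++ { printf "duplicate gene ID: %s\n", $1; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Usage: check_annotations annotations.txt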